Index-Based Persistent Document Identifiers

Authors

Abstract

Similar resources

Representing Document Lengths with Identifiers

The length of each indexed document is needed by most common text retrieval scoring functions to rank it with respect to the current query. For efficiency purposes information retrieval systems maintain this information in the main memory. This paper proposes a novel strategy to encode the length of each document directly in the document identifier, thus reducing main memory demand. The techniq...

Can Persistent Identifiers Be Cool?

The fast growth of scientific and non-scientific digital data, as well as the proliferation of new types of digital content, has led – among many other things – to a lot of innovative work on the concept of the identifier. Digital identifiers have become the key to preserving and accessing content, just as physical identifier tags have been the key to accessing paper-based content and other phy...

Implementing Persistent Identifiers. Overview of concepts, guidelines and recommendations

Traditionally, references to web content have been made by using URL hyperlinks. However, as links are 'broken' when content is moved to another location, a reference system based on URLs is inherently unstable and poses risks for continued access to web resources. To create a more reliable system for referring to published material on the web, from the mid-1990s a number of schemes have been d...

Assigning Document Identifiers to Enhance Compressibility of Fulltext Indices

Index compression has been a major issue in the field of Information Retrieval Systems. In particular, due to the impressive figures involved with Web Search Engines (WSEs) the compression of the index is not an option anymore but it has become a must. The most important index compression methods are designed to work for Inverted File (IF) indexes. These methods are based on the assumption that...

Phrase-based Document Similarity Based on an Index Graph Model

Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the underlying data model should be able to represent the phrases in the document as well as single terms. We present a novel data model, the Document Index Graph, which indexes web documents based on phrases, rather than sing...

Journal

Journal title: Information Retrieval

Year: 2005

ISSN: 1386-4564

DOI: 10.1023/b:inrt.0000048494.05013.6a